Significant Permission Identification for Android Malware Detection
A recent report indicates that a newly developed malicious app for Android is introduced every 11 seconds. To combat this alarming rate of malware creation, we need a scalable malware detection approach that is both effective and efficient. In this thesis, we introduce SigPID, a malware detection system based on permission analysis that is designed to cope with the rapid growth of Android malware. Instead of analyzing all 135 Android permissions, our approach applies 3-level pruning, mining the permission data to identify only the significant permissions that are effective in distinguishing benign from malicious apps. Based on the identified significant permissions, SigPID uses classification algorithms to classify different families of malware and benign apps. Our evaluation finds that only 25% of permissions (34 out of 135) are significant. We then compare the performance of our approach, using only 25% of all permissions, against a baseline approach that analyzes all permissions. The results indicate that when a Support Vector Machine (SVM) is used as the classifier, we can achieve over 90% precision, recall, accuracy, and F-measure, which is about the same as the baseline approach. We also show that SigPID is effective when used with 67 other commonly used supervised learning approaches. We find that 55 out of 67 algorithms achieve an F-measure of at least 85%, while the average running time is reduced by 85.6% compared with the baseline approach. When we compare the detection effectiveness of SigPID with that of other approaches, SigPID detects 96.54% of the malware in the data set, while other approaches detect 3.99% to 96.41%.
Advisers: Witawas Srisa-an, Qiben Ya
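The abstract does not spell out the three pruning levels, so the following is only a minimal sketch of the general idea: keep the permissions whose request rate differs most between benign and malicious apps, then score a classifier with precision, recall, and F-measure. The function names, the `min_gap` threshold, and the toy apps are assumptions made for illustration and are not taken from SigPID.

```python
from typing import Iterable, List, Set


def permission_support(apps: List[Set[str]], permission: str) -> float:
    """Fraction of the given apps that request `permission`."""
    if not apps:
        return 0.0
    return sum(permission in app for app in apps) / len(apps)


def prune_permissions(
    benign: List[Set[str]],
    malicious: List[Set[str]],
    min_gap: float = 0.5,  # hypothetical threshold, not from the thesis
) -> List[str]:
    """Keep permissions whose support differs by at least `min_gap`
    between the malicious and benign sets (a single illustrative
    pruning step, not SigPID's actual 3-level procedure)."""
    all_permissions: Iterable[str] = set().union(*benign, *malicious)
    kept = []
    for permission in all_permissions:
        gap = abs(
            permission_support(malicious, permission)
            - permission_support(benign, permission)
        )
        if gap >= min_gap:
            kept.append(permission)
    return sorted(kept)


def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Toy data: each app is the set of permissions it requests.
benign_apps = [
    {"INTERNET"},
    {"INTERNET", "CAMERA"},
    {"INTERNET", "ACCESS_FINE_LOCATION"},
]
malicious_apps = [
    {"INTERNET", "SEND_SMS"},
    {"INTERNET", "SEND_SMS", "READ_SMS"},
    {"INTERNET", "READ_SMS", "CAMERA"},
]

significant = prune_permissions(benign_apps, malicious_apps)
```

In this toy data, `INTERNET` and `CAMERA` are requested equally often by both groups and are pruned, while the SMS permissions survive. A classifier such as an SVM would then be trained on the surviving permissions only, which is where the reported reduction in running time would come from.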
MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents
Significant advancements have occurred in the application of Large Language
Models (LLMs) for various tasks and social simulations. Despite this, their
capacities to coordinate within task-oriented social contexts are
under-explored. Such capabilities are crucial if LLMs are to effectively mimic
human-like social behavior and produce meaningful results. To bridge this gap,
we introduce collaborative generative agents, endowing LLM-based agents with
consistent behavior patterns and task-solving abilities. We situate these
agents in a simulated job fair environment as a case study to scrutinize their
coordination skills. We propose a novel framework that equips collaborative
generative agents with human-like reasoning abilities and specialized skills.
Our evaluation demonstrates that these agents show promising performance.
However, we also uncover limitations that hinder their effectiveness in more
complex coordination tasks. Our work provides valuable insights into the role
and evolution of LLMs in task-oriented social simulations.
Private Model Compression via Knowledge Distillation
The soaring demand for intelligent mobile applications calls for deploying
powerful deep neural networks (DNNs) on mobile devices. However, the
outstanding performance of DNNs notoriously relies on increasingly complex
models, which in turn is associated with an increase in computational expense
far surpassing mobile devices' capacity. What is worse, app service providers
need to collect and utilize a large volume of users' data, which contain
sensitive information, to build the sophisticated DNN models. Directly
deploying these models on public mobile devices presents prohibitive privacy
risk. To benefit from the on-device deep learning without the capacity and
privacy concerns, we design a private model compression framework RONA.
Following the knowledge distillation paradigm, we jointly use hint learning,
distillation learning, and self learning to train a compact and fast neural
network. The knowledge distilled from the cumbersome model is adaptively
bounded and carefully perturbed to enforce differential privacy. We further
propose an elegant query sample selection method to reduce the number of
queries and control the privacy loss. A series of empirical evaluations as well
as the implementation on an Android mobile device show that RONA can not only
compress cumbersome models efficiently but also provide a strong privacy
guarantee. For example, on SVHN, when a meaningful
(ε, δ)-differential privacy is guaranteed, the compact model trained
by RONA can obtain a 20× compression ratio and a 19× speed-up with
merely 0.97% accuracy loss.
Comment: Conference version accepted by AAAI'1
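The "bounded and perturbed" step can be illustrated with a generic clip-then-add-Gaussian-noise routine, which is a standard way to limit sensitivity before adding noise for differential privacy. This is a sketch under assumptions, not RONA's implementation: the fixed `bound`, the `noise_multiplier`, and the function names are hypothetical, and the abstract does not say how the adaptive bound or the noise scale are actually chosen.

```python
import math
import random
from typing import List, Optional


def clip_l2(vector: List[float], bound: float) -> List[float]:
    """Scale `vector` down so its L2 norm is at most `bound`."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm <= bound or norm == 0.0:
        return list(vector)
    scale = bound / norm
    return [x * scale for x in vector]


def perturb(
    vector: List[float],
    bound: float,             # hypothetical fixed bound; RONA adapts it
    noise_multiplier: float,  # hypothetical; noise std = multiplier * bound
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Clip to the L2 bound, then add independent Gaussian noise."""
    rng = rng or random.Random()
    clipped = clip_l2(vector, bound)
    std = noise_multiplier * bound
    return [x + rng.gauss(0.0, std) for x in clipped]


# Example: a teacher output of norm 5 is clipped to norm 1, then noised.
teacher_output = [3.0, 4.0]
noisy_output = perturb(teacher_output, bound=1.0, noise_multiplier=1.0,
                       rng=random.Random(0))
```

Clipping caps how much any single query can reveal, and the noise scale is tied to that cap, so a smaller bound allows less noise for the same privacy level at the cost of distorting large outputs.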